Extracting and Rendering Representative Sequences

Authors

  • Alexis Gabadinho
  • Gilbert Ritschard
  • Matthias Studer
  • Nicolas S. Müller
Abstract

This paper is concerned with the summarization of a set of categorical sequences. More specifically, the problem studied is the determination of the smallest possible number of representative sequences that ensure a given coverage of the whole set, i.e. that together have a given percentage of the sequences in their neighborhood. The proposed heuristic for extracting the representative subset requires as main arguments a pairwise distance matrix, a representativeness criterion, and a distance threshold under which two sequences are considered redundant or, equivalently, in the neighborhood of each other. It first builds a list of candidates using a representativeness score and then eliminates redundancy. We also propose a visualization tool for rendering the results and quality measures for evaluating them. The proposed tools have been implemented in our TraMineR R package for mining and visualizing sequence data, and we demonstrate their efficiency on a real-world example from the social sciences. The methods are nonetheless in no way limited to social science data and should prove useful in many other domains.
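The two-step heuristic described in the abstract (score candidates, then eliminate redundancy until the target coverage is reached) can be illustrated with a minimal Python sketch. This is not the TraMineR implementation: the function names, the use of neighborhood density as the representativeness score, the inclusive threshold test, and the early stop once coverage is reached are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def neighborhood_density(dist: Sequence[Sequence[float]], radius: float) -> List[int]:
    """Assumed score: number of sequences within `radius` of each sequence."""
    n = len(dist)
    return [sum(1 for j in range(n) if dist[i][j] <= radius) for i in range(n)]


def extract_representatives(
    dist: Sequence[Sequence[float]], radius: float, coverage: float
) -> Tuple[List[int], float]:
    """Greedy sketch: rank candidates by score, skip redundant ones,
    and stop once the requested share of sequences is covered."""
    n = len(dist)
    scores = neighborhood_density(dist, radius)
    # Step 1: candidate list sorted by decreasing score (ties broken by index).
    candidates = sorted(range(n), key=lambda i: (-scores[i], i))

    representatives: List[int] = []
    covered = set()
    for c in candidates:
        if len(covered) >= coverage * n:
            break
        # Step 2: a candidate within `radius` of a chosen representative is redundant.
        if any(dist[c][r] <= radius for r in representatives):
            continue
        representatives.append(c)
        covered.update(j for j in range(n) if dist[c][j] <= radius)
    return representatives, len(covered) / n


# Toy example: five "sequences" placed on a line, distance = absolute difference.
points = [0, 1, 2, 10, 11]
dist = [[abs(a - b) for b in points] for a in points]
reps, cov = extract_representatives(dist, radius=1.5, coverage=0.8)
print(reps, cov)  # [1, 3] 1.0
```

In this toy run the sequence at index 1 has the densest neighborhood and is chosen first; indices 0 and 2 are then discarded as redundant, and index 3 is added to cover the second group.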

Similar resources

Summarizing Sets of Categorical Sequences - Selecting and Visualizing Representative Sequences

This paper is concerned with the summarization of a set of categorical sequence data. More specifically, the problem studied is the determination of the smallest possible number of representative sequences that ensure a given coverage of the whole set, i.e. that have together a given percentage of sequences in their neighborhood. The goal is to yield a representative set that exhibits the key f...

Searching for typical life trajectories applied to childbirth histories?

Abstract. We address, in this chapter, the identification of typical patterns that best characterize a set of sequences. More specifically, we focus on data-driven methods that search for the typical patterns among the observed sequences. In life course studies, such typical sequences serve, for instance, to describe ideal-type life trajectories; i.e., the common way(s) of organizing our life. ...

Quantitative Evaluation of the Lateral Sealing Ability of Normal Faults in Siliciclastic Sequences: Implication for Fault Trap in Well Gang 64, in the West Qikou Sag, China

The lateral sealing ability of a normal fault is a major factor in creating hydrocarbon traps. Therefore, a methodology for assessing the sealing ability of faults in the siliciclastic sequences of subsidence basins has been established; moreover, by using this methodology, the uncertainty inherent in hydrocarbon exploration can be decreased. Moreover, the petrophysical properties of fault roc...

Tensor Clustering for Rendering Many-Light Animations

Rendering animations of scenes with deformable objects, camera motion, and complex illumination, including indirect lighting and arbitrary shading, is a long-standing challenge. Prior work has shown that complex lighting can be accurately approximated by a large collection of point lights. In this formulation, rendering of animation sequences becomes the problem of efficiently shading many surf...

Video Summarization Using R-Sequences

In this paper, we propose a new method of temporal summarization of digital video. First, we address the problem of extracting a fixed number of representative frames to summarize a given digital video. To solve it, we have devised an algorithm called content-based adaptive clustering (CBAC). In our algorithm, shot boundary detection is not needed. Video frames are treated as points in the mul...

Publication date: 2010